Building A Tool For Annotating Reference In Discourse

Authors

  • Jonathan D. DeCristofaro
  • Michael Strube
  • Kathleen F. McCoy
Abstract

We discuss the development of a system for marking several types of reference to facilitate the analysis of reference in discourse. The tool is designed to be used in three applications: generating training data for machine learning of co-reference relations, evaluating theories of referring expression generation and resolution in texts, and developing theories for understanding reference in dialogs. The need to mark any of a broad set of relations which may span several levels of discourse structure drives the system architecture. The system has the ability to collect statistics over encoded relations and measure inter-coder reliability, and includes tools to increase the accuracy of the user’s markings by highlighting the discrepancies between two sets of markings. Using parsed corpora as the input further reduces the human workload and increases reliability.
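
The abstract does not say which reliability measure the system computes or how discrepancies are represented. As a minimal sketch only, assuming two coders have each assigned one label to the same ordered list of markables, Cohen's kappa and a discrepancy listing could look like the following. The function names, markable IDs, and label values (`coref`, `bridging`, `none`) are hypothetical and are not taken from the paper.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two coders over the same items.

    Assumption: one categorical label per item, in the same item order.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both coders must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: fraction of items given identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement: chance overlap from each coder's label distribution.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        # Both coders used a single identical label everywhere.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def discrepancies(markables, labels_a, labels_b):
    """Return (markable, label_a, label_b) for every item the coders disagree on."""
    return [
        (m, a, b)
        for m, a, b in zip(markables, labels_a, labels_b)
        if a != b
    ]


# Hypothetical example data, for illustration only.
markables = ["m1", "m2", "m3", "m4"]
coder_a = ["coref", "coref", "bridging", "none"]
coder_b = ["coref", "none", "bridging", "none"]

kappa = cohen_kappa(coder_a, coder_b)
diffs = discrepancies(markables, coder_a, coder_b)
```

A listing like `diffs` is the kind of output a tool could highlight so that coders revisit only the items they marked differently.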

Similar resources

Annotating Discourse Relations with the PDTB Annotator

The PDTB Annotator is a tool for annotating and adjudicating discourse relations based on the annotation framework of the Penn Discourse TreeBank (PDTB). This demo describes the benefits of using the PDTB Annotator, gives an overview of the PDTB Framework and discusses the tool’s features, setup requirements and how it can also be used for adjudication.

Toward a Discourse Theory for Annotating Causal Relations in Japanese

We present a revised discourse theory based on segmented discourse representation theory and provide a method for building a Japanese corpus suitable for causal relation extraction. This extends and refines the framework proposed in Kaneko and Bekki (2014), and we evaluate our corpus and compare it with that work.

DialogueView: annotating dialogues in multiple views with abstraction

This paper describes DialogueView, a tool for annotating dialogues with utterance boundaries, speech repairs, speech act tags, and hierarchical discourse blocks. The tool provides three views of a dialogue: WordView, which shows the transcribed words time-aligned with the audio signal; UtteranceView, which shows the dialogue line-by-line as if it were a script for a movie; and BlockView, which ...

Annotating the Structure and Semantics of Fables

This paper outlines an annotation scheme we developed for a corpus of fables. Reference is made to previous studies on discourse structure and story grammar, as well as discourse relations and text coherence. The applicability and adequacy of the various frameworks for annotating and analysing fables are considered. The current work addresses several issues including the basic units for discour...

Annotating Subordinators in the Turkish Discourse Bank

In this paper we explain how we annotated subordinators in the Turkish Discourse Bank (TDB), an effort that started in 2007 and is still continuing. We introduce the project and describe some of the issues that were important in annotating three subordinators, namely karşın, rağmen and halde, all of which encode the coherence relation Contrast-Concession. We also describe the annotation tool.

Publication date: 1999